Direct and Transposed Sparse Matrix-Vector Multiplication
Authors
Abstract
In this paper we investigate the execution of Ab and A^T b, where A is a sparse matrix and b a dense vector, using the Block Based Compression Storage (BBCS) scheme and an Augmented Vector Architecture (AVA). In particular, we demonstrate that by using the BBCS format we can represent both the direct and the transposed matrix for the purposes of matrix-vector multiplication, with no additional cost in storage, access time, or computational performance. To achieve this, we propose a new instruction and a hardware modification for the AVA. Subsequently we evaluate the performance of the transposed Sparse Matrix-Vector Multiplication (SMVM) and demonstrate that, as for the direct SMVM, the BBCS scheme outperforms other general schemes such as the Jagged Diagonal (JD) and the Compressed Row Storage (CRS) by 1.7 to 4.1 times. Furthermore, we show that the BBCS scheme outperforms CRS and JD when the aforementioned SMVM is used in the Conjugate Gradient and Bi-Conjugate Gradient iterative solver algorithms, for which speedups of 1.78 to 4.13 were achieved in simulations.
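The motivation for a single format serving both products can be seen in plain CRS, the baseline scheme the paper compares against. The same three arrays drive both y = Ab and y = A^T b, but the transposed product turns the row-wise gathers of the direct product into scatters, which is what degrades its performance on conventional architectures. A minimal sketch (not the paper's BBCS/AVA mechanism, and using illustratively named helpers):

```python
# CRS (compressed row storage) sketch: one set of arrays (row_ptr, col_idx,
# vals) serves both the direct and the transposed matrix-vector product.
# Note the direct product gathers from b, while the transposed product
# scatters into y -- the asymmetry that hurts transposed SMVM in CRS.

def crs_spmv(n_rows, row_ptr, col_idx, vals, b):
    """Direct product y = A * b from CRS arrays."""
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * b[col_idx[k]]      # gather from b
    return y

def crs_spmv_t(n_rows, n_cols, row_ptr, col_idx, vals, b):
    """Transposed product y = A^T * b from the same CRS arrays."""
    y = [0.0] * n_cols
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += vals[k] * b[i]      # scatter into y
    return y

# Example matrix A = [[1, 0, 2],
#                     [0, 3, 0]]
row_ptr, col_idx, vals = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
print(crs_spmv(2, row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))   # [3.0, 3.0]
print(crs_spmv_t(2, 3, row_ptr, col_idx, vals, [1.0, 1.0]))   # [1.0, 3.0, 2.0]
```

Both products matter in practice: the Bi-Conjugate Gradient solver evaluated in the paper requires a product with A^T on every iteration, so a format that handles the transpose without a separate copy of the matrix avoids doubling the storage.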
Similar resources
SIMD Parallel Sparse Matrix-Vector and Transposed-Matrix-Vector Multiplication in DD Precision
We accelerate a double precision sparse matrix and DD vector multiplication (DD-SpMV), and its transposition and DD vector multiplication (DD-TSpMV), by using SIMD AVX2 for Krylov subspace methods. We compare some storage formats of DD-SpMV and DD-TSpMV for AVX2 to eliminate performance degradation factors in CRS. Our experience indicates that BCRS4x1, with fitting block size to the SIMD register...
Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format
In earlier work we have introduced the "Recursive Sparse Blocks" (RSB) sparse matrix storage scheme oriented towards cache efficient matrix-vector multiplication (SpMV) and triangular solution (SpSV) on cache based shared memory parallel computers. Both the transposed (SpMV-T) and symmetric (SymSpMV) matrix-vector multiply variants are supported. RSB stands for a meta-format: it recursively...
Sparse Matrix-Vector Multiplication on NVIDIA GPU
In this paper, we present our work on developing a new matrix format and a new sparse matrix-vector multiplication algorithm. The matrix format is HEC, which is a hybrid format. This matrix format is efficient for sparse matrix-vector multiplication and is friendly to preconditioners. Numerical experiments show that our sparse matrix-vector multiplication algorithm is efficient on...
Preconditioned Conjugate Gradient
[Table of contents excerpt: Chapter 1. Introduction; Chapter 2. Background; 2.1. Matrix Compu...]
Sparse Data Structures for Weighted Bipartite Matching
Inspired by the success of blocking to improve the performance of algorithms for sparse matrix vector multiplication [5] and sparse direct factorization [1], we explore the benefits of blocking in related sparse graph algorithms. A natural question is whether the benefits of local blocking extend to other sparse graph algorithms. Here we examine algorithms for finding a maximum-weight complete ...